Feature analysis for automatic speechreading
Abstract
Audio-Visual Automatic Speech Recognition systems use visual information to enhance ASR systems in clean and noisy environments. This paper compares a number of different visual feature extraction methods. When performing visual speech recognition, the visual feature vector requires a base level of detail for optimum recognition. Geometric feature extraction yields lower recognition scores than pixel-based methods because it loses characteristic speech information such as f-tuck and protrusion. Downsampling the images reduces visual recognition scores because detail in the images is lost. The role of dynamic features in improving recognition was also investigated. Using static features only gave higher recognition scores than a feature vector of the same length containing both static and dynamic features. These results illustrate the need for a base level of detail in the feature vector for improved visual recognition scores.
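The static versus static-plus-dynamic comparison at a fixed vector length can be illustrated with a minimal sketch. This is not the paper's implementation: the function names, the toy frames, and the choice of a simple first-order difference between adjacent frames as the dynamic feature are all assumptions made for illustration. It shows that, at the same total length, adding dynamic features means keeping only half as many static coefficients per frame.

```python
from typing import List

Frame = List[float]  # one per-frame static feature vector (hypothetical values)


def delta_features(frames: List[Frame]) -> List[Frame]:
    """First-order difference between adjacent frames (assumed dynamic feature).

    The first frame has no predecessor, so its delta is all zeros.
    """
    deltas = [[0.0] * len(frames[0])]
    for prev, cur in zip(frames, frames[1:]):
        deltas.append([c - p for p, c in zip(prev, cur)])
    return deltas


def static_only(frames: List[Frame], length: int) -> List[Frame]:
    """Keep the first `length` static coefficients of every frame."""
    return [f[:length] for f in frames]


def static_plus_dynamic(frames: List[Frame], length: int) -> List[Frame]:
    """Same total length, split evenly: length/2 static + length/2 delta.

    Assumes `length` is even.
    """
    half = length // 2
    truncated = [f[:half] for f in frames]
    deltas = delta_features(truncated)
    return [s + d for s, d in zip(truncated, deltas)]


# Toy sequence of three frames with four static coefficients each.
frames = [[1.0, 2.0, 3.0, 4.0], [2.0, 2.0, 5.0, 1.0], [4.0, 1.0, 5.0, 0.0]]
a = static_only(frames, 4)          # four static coefficients per frame
b = static_plus_dynamic(frames, 4)  # two static + two delta per frame
```

In this sketch, `b` discards the third and fourth static coefficients to make room for the deltas, which is one way to picture why a fixed-length vector with dynamic features can carry less static detail.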
Similar resources
Exploiting lower face symmetry in appearance-based automatic speechreading
Appearance-based visual speech feature extraction is being widely used in the automatic speechreading and audio-visual speech recognition literature. In its most common application, the discrete cosine transform (DCT) is utilized to compress the image of the speaker’s mouth region-of-interest (ROI), and the highest energy spatial frequency components are retained as visual features. Good genera...
Linear discriminant analysis for speechreading
This paper investigates the use of Fisher-Rao linear discriminant analysis (LDA) as a means of visual feature extraction for hidden Markov model based automatic speechreading. For every video frame, a three-dimensional region of interest containing the speaker's mouth over a sequence of adjacent frames is lexicographically arranged into a data vector. Such vectors are then projected onto the sp...
Visual feature analysis for automatic speechreading
This paper proposes a novel method of visual feature extraction for automatic speechreading. While current methods of extracting delta or difference features involve computing the difference between adjacent frames, the proposed method provides information on how the visual features evolve over a time period longer than the time period between adjacent frames, the time period being relative t...
A hierarchy probability-based visual features extraction method for speechreading
This research is supported by the President Foundation of the Institute of Acoustics, Chinese Academy of Sciences (No.98-02) and “863” High Tech R&D Project of China (No. 863-306-ZD-11-1). Visual feature extraction has become the key technique in automatic speechreading systems. However, it remains a difficult problem due to large inter-person and intra-person appearance ...
Automatic Extraction of Lip Feature Points
We present a novel algorithm for the robust and reliable automatic extraction of lip feature points for speechreading. The algorithm uses a combination of colour information in the image data and knowledge about the structure of the mouth area to find certain feature points on the inner lip contour. A new confidence measure quantifying how well the feature extraction process worked is introduce...